Mixed Models 2
Humboldt-Universität zu Berlin
2024-01-12
Today we will learn…
df_biondo <-
read_csv(here("data", "Biondo.Soilemezidi.Mancini_dataset_ET.csv"),
locale = locale(encoding = "Latin1") ## for special characters in Spanish
) |>
clean_names() |>
mutate(gramm = ifelse(gramm == "0", "ungramm", "gramm")) |>
mutate_if(is.character,as_factor) |> # all character variables as factors
droplevels() |>
filter(adv_type == "Deic")

\[\begin{align} fp_i &= \beta_0 + \beta_{verb\_t}x_i + \beta_{gramm}x_i + e_i \label{eq-fixed_effects} \end{align}\]
where \(\beta_{verb\_t}\) and \(\beta_{gramm}\) are the fixed effects of tense (verb_t) and grammaticality (gramm)
\[\begin{align} fp_i &= \beta_0 + \alpha_{j[i]} + \alpha_{k[i]}+ \beta_{verb\_t}x_i + \beta_{gramm}x_i + e_i \label{eq-random_intercepts} \end{align}\]
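In lme4 syntax, this random-intercepts model can be sketched as follows (using the fit_fp_1 name and formula that appear later in these slides; sj and item are the participant and item identifiers):

```r
library(lme4)
library(lmerTest)  # for Satterthwaite t-tests in summary()

# Varying intercepts per participant (sj) and per item, with fixed effects
# of tense (verb_t), grammaticality (gramm), and their interaction
fit_fp_1 <- lmer(
  log(fp) ~ verb_t * gramm + (1 | sj) + (1 | item),
  data   = df_biondo,
  subset = roi == 4
)
```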
Missing values and subsetting conditions
N.B., because we subsetted the data to include only adv_type == "Deic", each participant contributed 64 data points to our current dataset, not 96.
So our overall number of observations should be 64*60 = 3840, minus however many missing observations we have; that is, \(i\) in \(fp_i\) runs from 1 to 3840, minus missing values.
We can use the nobs() function to find out the number of observations in a model. For example, our random-intercepts-only model from the last class had 3795 observations, meaning we had 3840 - 3795 = 45 missing observations. This amounts to 1.17% of trials, which is fine (around 5% of trials missing is not out of the ordinary).
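As a sanity check, the counts can be reproduced directly in R (here 3795 is hard-coded from the nobs() call on last class's model):

```r
n_expected  <- 64 * 60               # trials per participant x participants
n_observed  <- 3795                  # nobs(fit_fp_1) from last class
n_missing   <- n_expected - n_observed
pct_missing <- 100 * n_missing / n_expected

n_expected             # 3840
n_missing              # 45
round(pct_missing, 2)  # 1.17
```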
Why do we have missing values? This can depend on a lot of things, such as incorrect attention-check responses (not relevant for this data), measurement error, or pre-processing steps (likely the cause for this data, which is eye-tracking during reading).
Groups Name Std.Dev.
Word (Intercept) 38.201
Subject (Intercept) 91.004
Residual 127.258
lmer() syntax?
rt for reaction time

Groups Name Std.Dev.
Word (Intercept) 38.201
Subject (Intercept) 91.004
Residual 127.258
Estimate Std. Error t value
(Intercept) 619.6160 20.761039 29.845135
NativeLanguage1 106.2954 40.622552 2.616659
freq_c -29.4397 4.168753 -7.061990
Figure 1: lattice::dotplot(ranef(model))
(Intercept) verb_t1 gramm1 verb_t1:gramm1
5.95640363 0.06189237 0.00321152 -0.01431578
item (Intercept) verb_t1 gramm1 verb_t1:gramm1
1 1 6.022184 0.06189237 0.00321152 -0.01431578
2 2 5.761268 0.06189237 0.00321152 -0.01431578
3 3 5.854873 0.06189237 0.00321152 -0.01431578
4 4 6.056862 0.06189237 0.00321152 -0.01431578
5 5 6.138213 0.06189237 0.00321152 -0.01431578
6 6 6.331058 0.06189237 0.00321152 -0.01431578
A lot of people construct random intercept-only models but conceptually, it makes hella sense to include random slopes most of the time. After all, you can almost always expect that people differ with how they react to an experimental manipulation!
— Winter (2014), p. 17
\[\begin{align} fp_i &= \beta_0 + \alpha_{j[i]} + \alpha_{k[i]} + \beta_{verb\_t}x_i + (\beta_{gramm} + \gamma_{j[i]})x_i + e_i \label{eq-random_slopes} \end{align}\]
where \(\gamma_{j[i]}\) is the varying slope of gramm for the participant \(j\) of observation \(i\)
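In lme4 syntax, the by-participant gramm slope from this equation is added inside the participant term (a sketch; the model name fit_fp_sj_gramm is ours, not from the slides):

```r
# Varying intercepts for participants and items, plus a
# by-participant varying slope for grammaticality (gramm)
fit_fp_sj_gramm <- lmer(
  log(fp) ~ verb_t * gramm + (1 + gramm | sj) + (1 | item),
  data   = df_biondo,
  subset = roi == 4
)
```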
Figure 2: Mean effects by-participant overall (A) and per condition (B) with population-level effects in black, with the same plots by-item (C and D)
Model building
Today we are exploring the random effects of our model by adding and subtracting random slopes to ‘see what happens’. You typically would NOT do this!
Generally, you would start with a pre-defined random-effects structure justified by your experimental design and theory (your "maximal" model; Barr et al., 2013). We will get into model selection in the next (and last) session. Today we will be adding and removing varying slopes willy-nilly, which in a real analysis can amount to p-hacking, data dredging, or HARKing (Hypothesizing After the Results are Known).
Groups Name Std.Dev.
item (Intercept) 0.13929
sj (Intercept) 0.25795
Residual 0.40111
Add + gramm to (1 | sj)
Varying intercepts (1) per participant (|sj), and varying intercepts (1) and tense slopes (+ verb_t) per item (|item):

Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: log(fp) ~ verb_t * gramm + (1 | sj) + (1 + verb_t | item)
Data: df_biondo
Subset: roi == 4
REML criterion at convergence: 4216.2
Scaled residuals:
Min 1Q Median 3Q Max
-4.1758 -0.6096 -0.0227 0.6060 4.0568
Random effects:
Groups Name Variance Std.Dev. Corr
item (Intercept) 0.019424 0.13937
verb_t1 0.002513 0.05012 0.54
sj (Intercept) 0.066414 0.25771
Residual 0.160252 0.40032
Number of obs: 3795, groups: item, 96; sj, 60
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 5.956384 0.036763 79.249351 162.023 < 0.0000000000000002
verb_t1 0.061733 0.013970 93.398427 4.419 0.0000267
gramm1 0.003298 0.012999 3544.431926 0.254 0.80
verb_t1:gramm1 -0.014380 0.025998 3544.742546 -0.553 0.58
(Intercept) ***
verb_t1 ***
gramm1
verb_t1:gramm1
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) vrb_t1 gramm1
verb_t1 0.077
gramm1 0.000 -0.002
vrb_t1:grm1 0.000 0.002 0.000
Random intercept only
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 5.95640 0.03679 79.20081 161.90252 0.00000
verb_t1 0.06189 0.01303 3637.13315 4.75172 0.00000
gramm1 0.00321 0.01302 3637.18338 0.24657 0.80526
verb_t1:gramm1 -0.01432 0.02605 3637.10235 -0.54956 0.58265
Random intercept and slope
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 5.95638 0.03676 79.24935 162.02259 0.00000
verb_t1 0.06173 0.01397 93.39843 4.41890 0.00003
gramm1 0.00330 0.01300 3544.43193 0.25367 0.79977
verb_t1:gramm1 -0.01438 0.02600 3544.74255 -0.55312 0.58021
How has fit_fp_item changed?
Groups Name Std.Dev.
item (Intercept) 0.13929
sj (Intercept) 0.25795
Residual 0.40111
The item group now also contains a verb_t1 slope (Std.Dev. 0.05), with a correlation (Corr) of 0.54 between the by-item intercepts and slopes.
lattice::dotplot(): what do these plots tell us?
Figure 3: By-item varying intercepts and slopes (A), by-participant varying intercepts (B)
coef()
Use ranef() to get their deviances from the population-level intercept/slope.
Figure 4: Correlation of by-participant gramm1 slopes (x-axis) and intercepts (y-axis)
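The relationship between the two extractors can be sketched as follows (assuming a fitted random-intercepts model named fit; for each participant, coef() equals the population-level estimate plus that participant's ranef() deviation):

```r
fe <- fixef(fit)      # population-level (fixed) effects
re <- ranef(fit)$sj   # by-participant deviations from them
ce <- coef(fit)$sj    # by-participant coefficients

# For the intercept: coef() = fixef() + ranef()
all.equal(ce[["(Intercept)"]],
          fe[["(Intercept)"]] + re[["(Intercept)"]])
```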
Data: df_biondo
Subset: roi == 4
Models:
fit_fp_1: log(fp) ~ verb_t * gramm + (1 | sj) + (1 | item)
fit_fp_item: log(fp) ~ verb_t * gramm + (1 | sj) + (1 + verb_t | item)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
fit_fp_1 7 4210.3 4254.0 -2098.2 4196.3
fit_fp_item 9 4210.4 4266.5 -2096.2 4192.4 3.9796 2 0.1367
fit_fp_item
Add + verb_t to (1 | sj)
boundary (singular) fit: see help('isSingular')
boundary (singular) fit: see help('isSingular')
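Singularity can also be checked programmatically (a sketch, assuming a fitted model object named fit; isSingular() and VarCorr() are from lme4):

```r
# TRUE if some variance component is (near) zero or a
# correlation sits at the +/-1 boundary
isSingular(fit, tol = 1e-4)

# Inspect the estimated random-effects standard deviations and correlations
VarCorr(fit)
```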
Run help('isSingular') in the Console and see what you find.
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 5.95638 0.03676 79.24935 162.02259 0.00000
verb_t1 0.06173 0.01397 93.39843 4.41890 0.00003
gramm1 0.00330 0.01300 3544.43193 0.25367 0.79977
verb_t1:gramm1 -0.01438 0.02600 3544.74255 -0.55312 0.58021
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 5.95641 0.03676 79.17933 162.05485 0.00000
verb_t1 0.06173 0.01415 91.56777 4.36365 0.00003
gramm1 0.00329 0.01300 3544.45970 0.25349 0.79990
verb_t1:gramm1 -0.01434 0.02599 3544.77121 -0.55150 0.58133
Groups Name Std.Dev. Corr
item (Intercept) 0.139371
verb_t1 0.050125 0.542
sj (Intercept) 0.257710
Residual 0.400315
Figure 5: By-item (A) and by-participant (B) varying intercepts and slopes
fit_fp_item
A linear mixed model was fit to log-transformed first-pass reading times at the verb region, with grammaticality, tense, and their interaction as fixed effects, and with by-participant varying intercepts and by-item varying intercepts and tense slopes. Tense and grammaticality were sum contrast coded (past and grammatical = -0.5, future and ungrammatical = 0.5).
Today we learned…
| Term | Definition | Equation/Code |
|---|---|---|
| linear mixed (effects) model | NA | NA |